Frequency warping in sinusoidal coding

Bert den Brinker 



Audio coding schemes based on sinusoidal coding employ a certain segment
size, or multiple segment sizes (multiscale models), for the estimation of the sinusoidal
parameters and the extraction of the associated components. In a one-scale model, the
time-frequency resolution trade-off is a major determinant of the final quality/bit-rate
compromise, while for multiscale models problems arise from the scattering of
components over scales and the consequent parameter merging process. To overcome these
problems, it is proposed to use a single-scale frequency-warped sinusoidal estimation
mechanism in which the warped frequency scale resembles that of the human ear.

Things known so far 

The common approach in sinusoidal coding of audio is to segment the signal and estimate
the sinusoids within each segment, e.g. [1, 2, 3, 4, 5]. This causes problems with the required
time-frequency resolution trade-off, especially for high-quality audio coding where a large
frequency range is necessary. Therefore, multiscale models have been proposed [6, 7], but
these bring about the problem of scattering of components over scales and/or of merging 
the data retrieved at different scales. 

For LPC-coding of audio, it has been suggested to work in the warped frequency domain 
[8, 9, 10, 11, 12], where the warping is related to perceptually relevant scales, e.g., the Bark 
or ERB scale [13, 14]. 
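
As an illustration of what a warped frequency scale means here: a first-order allpass section A(z) = (z^-1 - lambda) / (1 - lambda z^-1) has the phase response -nu(omega), with nu(omega) = omega + 2 arctan(lambda sin(omega) / (1 - lambda cos(omega))), so a frequency omega is mapped to nu(omega). The sketch below evaluates this map. The function names are illustrative and not taken from this text. The closed-form choice of lambda for a Bark-like scale is the approximation commonly attributed to Smith and Abel; it is quoted here as an assumption and is not part of this disclosure.

```python
import numpy as np

def warp_frequency(omega, lam):
    """Map frequency omega (rad/sample) to the warped frequency nu(omega).

    nu is minus the phase of the first-order allpass
    A(z) = (z^-1 - lam) / (1 - lam z^-1) evaluated at z = exp(j*omega).
    """
    omega = np.asarray(omega, dtype=float)
    return omega + 2.0 * np.arctan2(lam * np.sin(omega),
                                    1.0 - lam * np.cos(omega))

def bark_warping_coefficient(fs_hz):
    """Assumed closed-form lambda that approximates the Bark scale.

    Approximation attributed to Smith and Abel (sampling rate in kHz);
    an assumption of this sketch, not taken from the source text.
    """
    fs_khz = fs_hz / 1000.0
    return 1.0674 * np.sqrt((2.0 / np.pi) * np.arctan(0.06583 * fs_khz)) - 0.1916
```

For positive lambda the map stretches the low-frequency region (nu(omega) > omega for 0 < omega < pi), which is the qualitative behaviour of the Bark and ERB scales; at omega = 0 and omega = pi the map leaves the frequency unchanged.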

The problem for which this invention brings the solution

The problem is either the time-frequency resolution in a single-scale model or the co-operation
of the different scales (realised in parallel paths) in a multiscale sinusoidal estimation/extraction
mechanism.

Proposed measures

In order to obtain a sinusoidal estimation mechanism with a time-frequency resolution 
close to that employed in the human ear, it is proposed to use the warping technique. 



Embodiment 

In current sinusoidal coders, a tapped delay line is used to define a segment of data which is
input to the sinusoidal analysis module. Next, this data is analysed for sinusoidal content:
typically, the data is windowed and Fourier transformed to detect the relevant sinusoidal
components.
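
The conventional analysis step described above can be sketched as follows. This is a minimal illustration, assuming a Hann window and plain peak picking on the magnitude spectrum without interpolation; the function name `analyse_segment` and the parameter `n_peaks` are hypothetical and do not come from the coders cited in this text.

```python
import numpy as np

def analyse_segment(segment, fs, n_peaks=3):
    """Window and Fourier transform one segment, then pick spectral peaks.

    Returns a list of (frequency_hz, amplitude, phase) tuples for the
    n_peaks strongest local maxima of the magnitude spectrum.
    """
    segment = np.asarray(segment, dtype=float)
    n = len(segment)
    window = np.hanning(n)
    spectrum = np.fft.rfft(segment * window)
    mag = np.abs(spectrum)

    # A bin counts as a peak if it exceeds its left neighbour and is
    # not smaller than its right neighbour.
    interior = np.arange(1, len(mag) - 1)
    peaks = interior[(mag[1:-1] > mag[:-2]) & (mag[1:-1] >= mag[2:])]
    strongest = peaks[np.argsort(mag[peaks])[::-1][:n_peaks]]

    # Amplitude is corrected for the window gain; phase is read at the
    # peak bin and refers to the first sample of the segment.
    return [(k * fs / n, 2.0 * mag[k] / window.sum(), float(np.angle(spectrum[k])))
            for k in strongest]
```

With a fixed segment length, the bin spacing fs / n is the same at all frequencies, which is the time-frequency resolution trade-off referred to in this text.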

Instead of a tapped delay line, a chain of first-order allpass sections can be used, with one
allpass section replacing each delay. Taking the Fourier transform of the outputs of the
allpass sections, we obtain the Fourier transform on a frequency-warped scale. The sinusoidal
extraction can then be done as usual.
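
A minimal sketch of this warped delay line, under the assumption that each section is A(z) = (z^-1 - lambda) / (1 - lambda z^-1) and that a Hann window is applied across the taps before the transform. The names `warped_delay_line`, `warped_spectrum` and `warp_frequency` are illustrative only; the window choice and the value of lambda are not specified in this text.

```python
import numpy as np

def warped_delay_line(x, lam, n_taps):
    """Feed x through a chain of first-order allpass sections.

    Tap 0 is the input itself; tap k is the output of the k-th section.
    Returns the tap values after the last input sample. With lam = 0
    every section is a unit delay and the result is the last n_taps
    input samples, newest first.
    """
    prev = np.zeros(n_taps)  # tap values at the previous time instant
    for sample in np.asarray(x, dtype=float):
        cur = np.empty(n_taps)
        cur[0] = sample
        for k in range(1, n_taps):
            # y_k[n] = -lam*y_{k-1}[n] + y_{k-1}[n-1] + lam*y_k[n-1]
            cur[k] = -lam * cur[k - 1] + prev[k - 1] + lam * prev[k]
        prev = cur
    return prev

def warped_spectrum(x, lam, n_taps):
    """Fourier transform of the windowed allpass-chain taps."""
    taps = warped_delay_line(x, lam, n_taps)
    return np.fft.rfft(taps * np.hanning(n_taps))

def warp_frequency(omega, lam):
    """Warped frequency at which a sinusoid of frequency omega appears."""
    return omega + 2.0 * np.arctan2(lam * np.sin(omega), 1.0 - lam * np.cos(omega))
```

In steady state a sinusoid of frequency omega appears across the taps as a sinusoid of frequency nu(omega) in the tap index, so the peak of the transform lies at the warped frequency. For example, with lam = 0.5 an input at 0.1 pi rad/sample peaks near 0.887 rad/tap. The sample-by-sample loop is written for clarity rather than speed.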

After sinusoidal extraction, the subsequent processing stage is residual modelling. The
cheapest way of residual modelling is probably to use a parametric model for the power
spectral density function. We note here that such an approach allows the integration of
sinusoidal and noise estimation since, for noise modelling, warped LPC [8, 9, 10, 11, 12] or
warped ARMA modelling (according to PHNL00028T and PHNL0002S8) can be
used.
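
For the noise-modelling step, a generic textbook form of warped linear prediction can be sketched as follows: the unit delays of the autocorrelation method are replaced by first-order allpass sections, and the resulting normal equations are solved directly. This is an assumption-laden illustration and not the method of the cited references; the function names and the direct solver are choices made for this sketch.

```python
import numpy as np

def warped_autocorrelation(x, lam, order):
    """r[k] = sum_n x[n] * y_k[n], where y_k is x passed through k
    first-order allpass sections (y_0 = x)."""
    x = np.asarray(x, dtype=float)
    r = np.empty(order + 1)
    y = x.copy()
    r[0] = x @ y
    for k in range(1, order + 1):
        out = np.empty_like(y)
        prev_in = 0.0
        prev_out = 0.0
        for n in range(len(y)):
            # One allpass section: out[n] = -lam*in[n] + in[n-1] + lam*out[n-1]
            out[n] = -lam * y[n] + prev_in + lam * prev_out
            prev_in = y[n]
            prev_out = out[n]
        y = out
        r[k] = x @ y
    return r

def warped_lpc(x, lam, order):
    """Solve the Toeplitz normal equations built from the warped
    autocorrelation. Returns (coefficients, prediction_error_energy)."""
    r = warped_autocorrelation(x, lam, order)
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, r[1:order + 1])
    err = r[0] - a @ r[1:order + 1]
    return a, err
```

With lam = 0 the allpass sections reduce to unit delays and the routine becomes conventional autocorrelation-method LPC; a nonzero lam fits the all-pole model on the warped frequency scale, which is the same scale on which the sinusoids are estimated.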

Application areas 

Audio and speech coding. 

Figure 1: Usual sinusoidal analysis. The signal x is input to a delay line. The content of
the delay line is input to a downsampler: once per D samples the input is passed to the
sinusoidal analysis SA. Typically, the consecutive segments that are input to SA overlap.
The outputs of the mechanism SA are the sinusoidal parameters describing the signal.

Figure 2: Warped Fourier transform using allpass sections A(z) followed by a downsampler
D. The output samples are input to the mechanism for sinusoidal analysis SA. The outputs of this
mechanism are the sinusoidal parameters describing the signal. In view of the warping
operation, a pre-filtering operation can be applied to x for (partial) amplitude and/or
phase compensation of the allpass line.

Fig. 3

Fig. 3 shows an embodiment of the invention. An audio and/or speech signal A1 is furnished
to a parametric encoder and coded into an encoded audio and/or speech signal A2. The
encoded signal A2 is transmitted over a communication channel or stored on a storage
medium. A parametric decoder obtains the encoded signal from the communication channel
or storage medium and decodes this signal A2 into a decoded audio and/or speech signal A1',
which is a representation of A1. The parametric encoder according to this embodiment of the
invention estimates sinusoidal parameters on a frequency-warped scale. The estimated
sinusoidal parameters are included in the bit-stream A2 and transmitted to the decoder. In the
decoder, on the basis of these sinusoidal parameters which have been estimated on a
frequency-warped scale, a reconstruction of the original audio signal is made: A1'.
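
The decoder side can be illustrated with a minimal sketch. It assumes, purely for illustration, that the bit-stream carries frequencies on the warped scale, that the decoder knows the same warping coefficient lambda as the encoder, and that each segment is synthesised as a sum of stationary cosines; none of these details is specified in this text, and the function names are hypothetical. The inverse of the first-order allpass frequency map is the same map with -lambda.

```python
import numpy as np

def unwarp_frequency(nu, lam):
    """Map a warped frequency nu (rad) back to the linear frequency scale.

    This is the first-order allpass frequency map evaluated with -lam.
    """
    nu = np.asarray(nu, dtype=float)
    return nu - 2.0 * np.arctan2(lam * np.sin(nu), 1.0 + lam * np.cos(nu))

def decode_segment(params, lam, n_samples):
    """Synthesise one segment from (warped_frequency, amplitude, phase) triples."""
    n = np.arange(n_samples)
    out = np.zeros(n_samples)
    for nu, amp, phase in params:
        omega = float(unwarp_frequency(nu, lam))  # rad/sample, linear scale
        out += amp * np.cos(omega * n + phase)
    return out
```

If the encoder instead converts the estimated frequencies back to the linear scale before transmission, the unwarping step moves to the encoder and the decoder reduces to the plain sum of cosines.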

CLAIMS:



1. A parametric coding method of encoding an audio (and/or speech) signal,
which method comprises the steps of estimating sinusoidal parameters and extracting 
associated components, wherein the estimating step is performed on a frequency warped 
scale. 

2. A parametric encoder for encoding an audio (and/or speech) signal, which
device comprises means for estimating sinusoidal parameters and extracting associated 
components, wherein the estimating step is performed on a frequency warped scale. 

3. A parametric decoding method of decoding an encoded audio (and/or speech)
signal, which method comprises the step of receiving the encoded audio signal which
includes sinusoidal parameters which have been estimated on a frequency warped scale, and
using said sinusoidal parameters in the reconstruction of an audio signal. 

4. A parametric decoder for decoding an encoded audio and/or speech signal,
which decoder comprises means for receiving the encoded audio signal which includes 
sinusoidal parameters which have been estimated on a frequency warped scale, and means for 
using said sinusoidal parameters in the reconstruction of an audio signal. 

5. An encoded audio and/or speech signal, which signal includes sinusoidal
parameters which have been estimated on a frequency warped scale.